The Case for Efficient File Access Pattern Modeling
Authors
Abstract
Most modern I/O systems treat each file access independently. However, events in a computer system are driven by programs, so file accesses occur in consistent patterns and are by no means independent. As a result, modern I/O systems ignore useful information. Using traces of file system activity, we show that file accesses are strongly correlated with preceding accesses. In fact, a simple last-successor model (one that predicts that each file access will be followed by the same file that followed it the last time it was accessed) successfully predicted the next file 72% of the time. We compare two previously proposed models for file access prediction against this baseline and find a stark contrast in accuracy, along with high overheads in state space. We then enhance one of these models to address its space requirements. The new model improves on the accuracy of the last-successor model by an additional 10%, while working within a state space that is within a constant factor (relative to the number of files) of that of the last-successor model. While this work was motivated by the use of file relationships for I/O prefetching, information about the likelihood of file access patterns has several other uses, such as disk layout and file clustering for disconnected operation.
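The last-successor baseline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `LastSuccessorModel`, the helper `prediction_accuracy`, and the example trace are hypothetical, and the scoring rule (a file with no recorded successor counts as a miss) is an assumption, since the abstract does not say how such cases were counted.

```python
from typing import Dict, Iterable, Optional


class LastSuccessorModel:
    """Remembers, for each file, the file that followed it most recently."""

    def __init__(self) -> None:
        # One table entry per file, so state grows linearly with the file count.
        self._successor: Dict[str, str] = {}
        self._previous: Optional[str] = None

    def predict(self, current: str) -> Optional[str]:
        """Return the file predicted to follow `current`, or None if unseen."""
        return self._successor.get(current)

    def observe(self, accessed: str) -> None:
        """Record an access, updating the successor of the previous file."""
        if self._previous is not None:
            self._successor[self._previous] = accessed
        self._previous = accessed


def prediction_accuracy(trace: Iterable[str]) -> float:
    """Fraction of accesses correctly predicted from the preceding access.

    Assumption: a missing prediction (file not seen before) counts as a miss.
    """
    model = LastSuccessorModel()
    correct = 0
    total = 0
    previous: Optional[str] = None
    for name in trace:
        if previous is not None:
            total += 1
            if model.predict(previous) == name:
                correct += 1
        model.observe(name)
        previous = name
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical trace: a program that repeatedly opens the same three files.
    trace = ["a", "b", "c", "a", "b", "c"]
    print(prediction_accuracy(trace))
```

Because the table holds a single successor per file, its size is bounded by the number of distinct files, which is the state-space baseline the abstract compares the other models against.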
Similar resources
An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity
Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become a hot research topic. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...
E2DR: Energy Efficient Data Replication in Data Grid
Data grids are an important branch of grid computing that provides mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems has traditionally focused on performance improvements driven by the demands of client applications in scientific and business domai...
Using the Adaptive Frequency Nonlinear Oscillator for Earning an Energy Efficient Motion Pattern in a Leg-Like Stretchable Pendulum by Exploiting the Resonant Mode
In this paper we investigate a biological framework to generate and adapt a motion pattern so that it can be energy efficient. In fact, the motion pattern in legged animals and humans emerges from the interaction between a central pattern generator (CPG) neural network and the musculoskeletal system. Here, we model this neuro-musculoskeletal system by means of a leg-like mechanical system cal...
Simple, Accurate and Computationally Efficient Wireless Channel Modeling Algorithm
We propose a simple and computationally efficient wireless channel modeling algorithm. For this purpose we adopt a special case of the algorithm initially proposed in [1] and show that its complexity decreases significantly when the time series is covariance-stationary and binary in nature. We show that for such time series the solution of the inverse eigenvalue problem returns unique transition pr...
A Method for Protecting Access Pattern in Outsourced Data
Protecting the information access pattern, which means preventing the disclosure of data and structural details of databases, is very important in working with data, especially in the cases of outsourced databases and databases with Internet access. The protection of the information access pattern indicates that mere data confidentiality is not sufficient and the privacy of queries and accesses...